Route only apt-packages/container legs to GitHub-hosted (scoped) #97

Merged
ChrisRackauckas merged 1 commit into SciML:master from ChrisRackauckas-Claude:default-runner-github-hosted
Jun 16, 2026

@ChrisRackauckas-Claude

Scoped: route only apt-packages/container legs to GitHub-hosted

Note: please ignore until reviewed by @ChrisRackauckas.

Re-scope of the now-closed #96 (which proposed pinning the default runner to ubuntu-24.04). That was wrong — it would have moved the whole SciML fleet's CI onto GitHub-hosted runners, which lack the capacity for SciML's throughput. #96 could not be reopened (its branch was force-pushed after it was closed), so this PR — on the same branch default-runner-github-hosted — is its live successor.

Runner facts (unchanged)

The SciML self-hosted pool (demeter*/arctic*, ephemeral *-cxnps-*) is registered with the custom label ubuntu-latest — the same label GitHub-hosted runners answer. Its full label set is {self-hosted, Linux, X64, gpu, high-memory, ubuntu-latest}; it does not carry the pinned ubuntu-24.04 label. So:

  • ubuntu-latest → self-hosted-capable (kept as the default for throughput).
  • ubuntu-24.04 → GitHub-hosted only (has passwordless sudo + docker).
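
The routing consequence of these label sets can be sketched as a subset check: a pool can take a job only if it carries every label the job asks for. This is a minimal illustration, not GitHub's actual scheduler. The self-hosted set is the one listed above; the `GITHUB_HOSTED` set and the `eligible_pools` helper are assumptions made up for this example.

```python
# Self-hosted label set as listed in this PR description.
SELF_HOSTED = {"self-hosted", "Linux", "X64", "gpu", "high-memory", "ubuntu-latest"}
# Assumption for illustration: a simplified stand-in for GitHub-hosted labels.
GITHUB_HOSTED = {"ubuntu-latest", "ubuntu-24.04"}

POOLS = {"self-hosted": SELF_HOSTED, "github-hosted": GITHUB_HOSTED}


def eligible_pools(runs_on):
    """Return the names of pools whose labels include every requested label."""
    # runs-on may be a single string or a list of labels.
    wanted = {runs_on} if isinstance(runs_on, str) else set(runs_on)
    return sorted(name for name, labels in POOLS.items() if wanted <= labels)


print(eligible_pools("ubuntu-latest"))    # either pool may pick the job up
print(eligible_pools(["ubuntu-24.04"]))   # only the GitHub-hosted pool matches
```

Under this model, `ubuntu-latest` stays self-hosted-capable while `ubuntu-24.04` can only be answered by GitHub-hosted runners, which is the property the override relies on.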

What this changes

Keep ubuntu-latest as the default for normal test/downgrade legs, and force GitHub-hosted only for the legs that genuinely need passwordless sudo / docker — i.e. exactly the legs where the caller passes apt-packages or a container. The persistent demeter*/arctic* runners lack passwordless sudo, so sudo apt-get (apt provisioning) intermittently fails with sudo: a terminal is required to read the password whenever such a leg lands there (ChrisRackauckas/InternalJunk#52); containers likewise need a Docker host.

Conditional added at the reusable job's runs-on:

runs-on: ${{ (inputs.apt-packages != '' || inputs.container != '') && fromJSON('["ubuntu-24.04"]') || <existing default> }}

fromJSON('["ubuntu-24.04"]') is a non-empty (truthy) array, so the GitHub Actions &&/|| ternary does not fall through; the default branch returns the existing value (an array via fromJson(inputs.runner), or a string 'self-hosted' / inputs.os / 'ubuntu-latest'). Both branches are valid runs-on forms (array or string) — the same idiom tests.yml already used for the runner/os selection. actionlint validates it (exit 0).
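
To make the fall-through argument concrete, here is a small Python emulation of the `&&`/`||` idiom. It is a sketch under stated assumptions, not the code in test/runtests.jl: the helper names (`gha_truthy`, `gha_and`, `gha_or`, `resolve_runs_on`) are hypothetical, the `default` parameter stands in for `<existing default>`, and the falsy set follows the values GitHub documents (`false`, `0`, `""`, `null`).

```python
def gha_truthy(value):
    """Approximate GitHub Actions truthiness: false, 0, '' and null are falsy."""
    if value is None or value is False:
        return False
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value != 0
    if isinstance(value, str):
        return value != ""
    return True  # arrays and objects are truthy, unlike an empty list in Python


def gha_and(left, right):
    """a && b yields b when a is truthy, otherwise a."""
    return right if gha_truthy(left) else left


def gha_or(left, right):
    """a || b yields a when a is truthy, otherwise b."""
    return left if gha_truthy(left) else right


def resolve_runs_on(apt_packages, container, default):
    """Emulate: (apt != '' || container != '') && ["ubuntu-24.04"] || default."""
    needs_hosted = apt_packages != "" or container != ""
    return gha_or(gha_and(needs_hosted, ["ubuntu-24.04"]), default)


print(resolve_runs_on("python3-scipy", "", "ubuntu-latest"))  # override applies
print(resolve_runs_on("", "", ["self-hosted", "gpu"]))        # default preserved
```

The idiom is only safe because the middle operand is truthy. If it were an empty string, `gha_or(gha_and(True, ""), default)` would return `default` even though the condition held.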

Where it landed

The reusables that accept apt-packages/container and have a direct job-level runs-on:

| Reusable | Job | Default branch |
| --- | --- | --- |
| tests.yml | tests (leaf) | fromJson(inputs.runner) / self-hosted / inputs.os (ubuntu-latest) |
| downgrade.yml | downgrade | self-hosted / inputs.os (ubuntu-latest) |
| sublibrary-downgrade.yml | test | ubuntu-latest |

grouped-tests.yml and sublibrary-project-tests.yml route their matrices through tests.yml (forwarding apt-packages/container wherever the caller can set them), so the single conditional in tests.yml covers them.

Reverted from #96 / left unchanged on purpose

  • scripts/compute_affected_sublibraries.jl default runner and the os defaults in tests.yml/downgrade.yml are back to ubuntu-latest.
  • detect/discover helper jobs (grouped-tests.yml, sublibrary-project-tests.yml, sublibrary-downgrade.yml discover) stay ubuntu-latest.
  • sublibrary-project-tests.yml exposes no apt-packages/container input and passes none through, so its matrix legs never need the override.
  • Groups with an explicit self-hosted runner (GPU) or an OS-axis override set no apt-packages/container, so they stay self-hosted / on their OS.

Affected repos

The only repos passing apt-packages/container today:

  • SciPyDiffEq.jl: apt-packages: "python3-scipy"
  • deSolveDiffEq.jl: apt-packages: "r-base-dev r-cran-desolve"
  • FEniCS.jl: container: "cmhyett/julia-fenics:latest"

Tests

test/runtests.jl re-asserts the default matrix runner is ubuntu-latest, and a new runs-on conditional testset confirms the real expression is present in each reusable and (emulating GitHub Actions truthiness) that apt-packages/container resolve to ubuntu-24.04 while the default (incl. GPU self-hosted override) is preserved. Passing on Julia 1.10 and 1.12; actionlint clean. Live routing itself can only be proven by a retagged run.

Deploy

Needs a v1 retag to take effect fleet-wide.

🤖 Generated with Claude Code

…fleet-wide)

The SciML self-hosted pool (demeter*/arctic*, ephemeral *-cxnps-*) is
registered with the custom label `ubuntu-latest`, the same label GitHub-hosted
runners answer; the pinned `ubuntu-24.04` label is carried ONLY by
GitHub-hosted runners. The earlier approach of pinning the DEFAULT test runner
to `ubuntu-24.04` is wrong: it would move the WHOLE fleet's CI onto
GitHub-hosted, which lacks the capacity for SciML's throughput.

Instead, keep the self-hosted-capable `ubuntu-latest` as the default for normal
test/downgrade legs, and force GitHub-hosted ONLY for the jobs that genuinely
need passwordless sudo / docker -- i.e. exactly the legs where the caller passes
`apt-packages` or a `container`. The persistent demeter*/arctic* runners lack
passwordless sudo, so `sudo apt-get` (apt-packages provisioning) intermittently
fails with "sudo: a terminal is required to read the password" whenever such a
leg lands on one (ChrisRackauckas/InternalJunk#52); containers likewise need a
Docker-capable host.

Implementation: a conditional at the reusable job's `runs-on`:

    runs-on: ${{ (inputs.apt-packages != '' || inputs.container != '')
                 && fromJSON('["ubuntu-24.04"]')
                 || <existing default> }}

`fromJSON('["ubuntu-24.04"]')` returns a non-empty (truthy) array, so the
GitHub Actions &&/|| ternary does not fall through; the default branch returns
the existing value (an array via fromJson(inputs.runner), or a string
'self-hosted'/inputs.os/'ubuntu-latest'). Both branches are valid `runs-on`
forms (array or string), matching the idiom tests.yml already used.

Landed in the reusables that accept apt-packages/container AND have a direct
job-level runs-on:
  - tests.yml          (leaf test job; grouped-tests.yml and
                        sublibrary-project-tests.yml route their matrices
                        through here, and direct callers pass apt/container here)
  - downgrade.yml      (downgrade job)
  - sublibrary-downgrade.yml (test job; discover helper stays ubuntu-latest)

Left unchanged on purpose:
  - The default in compute_affected_sublibraries.jl and the os defaults in
    tests.yml/downgrade.yml are back to `ubuntu-latest` (reverted from the
    fleet-wide pin).
  - detect/discover helper jobs (grouped-tests.yml, sublibrary-project-tests.yml,
    sublibrary-downgrade.yml discover) stay `ubuntu-latest`.
  - sublibrary-project-tests.yml exposes no apt-packages/container input and
    passes none through, so its matrix legs never need the override.
  - Groups with an explicit self-hosted `runner` (GPU) or an OS-axis override
    set no apt-packages/container, so they stay self-hosted / on their OS.

Affected repos (the only ones passing apt-packages/container today):
SciPyDiffEq.jl (apt-packages "python3-scipy"), deSolveDiffEq.jl (apt-packages
"r-base-dev r-cran-desolve"), FEniCS.jl (container "cmhyett/julia-fenics:latest").

Tests: test/runtests.jl re-asserts the default matrix runner is `ubuntu-latest`,
and a new "runs-on conditional" testset confirms the real expression is present
in each reusable and (emulating GitHub Actions truthiness) that apt-packages /
container resolve to ubuntu-24.04 while the default (incl. GPU self-hosted
override) is preserved. Passing on Julia 1.10 and 1.12; actionlint clean.

Needs a v1 retag to take effect fleet-wide. Live routing can only be confirmed
by a retagged run.

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ChrisRackauckas ChrisRackauckas marked this pull request as ready for review June 16, 2026 09:51
@ChrisRackauckas ChrisRackauckas merged commit 9c897dd into SciML:master Jun 16, 2026
2 checks passed